Abstract

At the LHC, tagging boosted heavy particle resonances which decay hadronically, such as top 
quarks and Higgs bosons, can play an essential role in new physics searches. In events with high 
multiplicity, however, the standard approach to tag boosted resonances by a large-radius fat jet 
becomes difficult because the resonances are not well-separated from other hard radiation. In this 
paper, we propose a different approach to tag and reconstruct boosted resonances by using the 
recently proposed mass-jump jet algorithm. A key feature of the algorithm is the flexible radius 
of the jets, which results from a terminating veto that prevents the recombination of two hard 
prongs if their combined jet mass is substantially larger than the masses of the separate prongs. 
The idea of collecting jets in “buckets” is also used. As an example, we consider the fully hadronic 
final state of pair-produced vectorlike top partners at the LHC, $pp \to T\bar{T} \to t\bar{t}HH$, and show
that the new approach works well. We also show that tagging and kinematic reconstruction of 
boosted top quarks and Higgs bosons are possible with good quality even in these very busy final 
states. The vectorlike top partners are kinematically reconstructed, which allows their direct mass 
measurement. 
I. INTRODUCTION


The high-energy frontier of particle physics is often probed at hadron colliders, such
as the Tevatron and the Large Hadron Collider (LHC). Most of the time, collisions at hadron
colliders produce coloured partons that undergo showering and hadronization, resulting in
a large number of final-state hadrons. In order to extract momentum, energy, and other
information about the hard-scattering partons, it is necessary to cluster the hadrons observed
in the detectors into jets. Studying jets and their clustering procedure (the jet algorithm) is
therefore key to understanding the physics at hadron colliders.

It is particularly interesting to consider the clustering of hadrons originating from boosted 
heavy particle resonances such as top quarks and Higgs bosons, given that the LHC restarts 
with an increased energy $\sqrt{s} = 13$-$14$ TeV and will produce such resonances copiously. The
construction of large-radius “fat” jets has become a common approach to deal with such
scenarios, facilitated by progress in tagging of boosted resonances (see e.g. Refs. [1, 2] and
references therein). By investigating the substructure of a fat jet, more information on
the energy deposit pattern is available compared to separately resolved small-radius jets.
Furthermore, the classical problem of finding an optimal jet radius [3-5] is avoided and jet
combinatorics are significantly reduced. 

A more challenging situation arises when multiple heavy resonances are produced simultaneously.
Such processes lead to very busy final states where the heavy particles under consideration
(such as top and Higgs) as well as their daughter particles are not well separated (see Fig. 1).
As a result, fat jets merge in most events, and a majority of the jets contain
decay products from more than one resonance. Such a scenario is not adequately addressed
by most tagging algorithms based on (isolated) fat jets.

In the present work, we suggest a new framework of jet tagging that allows particle reconstruction
with good quality compared to traditional methods in such a busy hadronic
environment. A key ingredient is the “mass-jump” jet clustering algorithm [6], which is an
extension of the “mass-drop” subjet identification in fat jets as employed in the HEPTopTagger [7].
It has been shown that the mass-jump algorithm gives competitive performance,
but now the “sub”jets are formed directly without the definition of an intermediate fat jet.
Mass-jump clustering thus retains the main advantage of fat-jet substructure algorithms, namely
resolving small jets without reference to a fixed radius.
FIG. 1: Angular distances between daughter particles from top and Higgs decays, in the benchmark
process $pp \to T\bar{T} \to t\bar{t}HH \to 10$ jets at the LHC with $\sqrt{s} = 14$ TeV, with vectorlike top
mass $m_T = 1$ TeV (parton level, arbitrary units). The largest $R$-distance between the daughters
of the same top quark (Higgs boson) is denoted by $\Delta R_{\max}(t)$ ($\Delta R_{\max}(H)$) and plotted with black
solid (dashed) lines. The minimal $R$-distance between the nearest-neighbour daughters coming
from different mothers, $\Delta R(\mathrm{NN})$, is depicted by the red line. See Sec. III for details.


The absence of the fat jet, however, reintroduces the issue of large combinatorics in such 
busy environments, since the decay products of the heavy resonances cannot be disentangled 
a priori. To facilitate event reconstruction, the idea of collecting jets into separate “buckets”
[8, 9] is applied, which allows efficient assignment of jets to their respective resonances.

The jet-tagging method proposed here is applicable to a broad range of Standard Model 
(SM) and beyond the SM phenomena at hadron colliders. There are indeed important SM 
processes which involve decays of multiple heavy particles, resulting in a busy hadronic environment.
A prime example is the associated production of a Higgs boson with two top
quarks ($pp \to t\bar{t}H$), which has attracted attention as this channel opens up the opportunity
to measure directly the Higgs-top Yukawa coupling, an essential probe towards understanding
the Higgs sector.

Some models of supersymmetry (SUSY) also predict a large multiplicity of jets with little
or no missing transverse energy (MET). For example, assuming that the gluino is the lightest
SUSY particle, it can decay into a top quark and jets when baryonic R-parity associated with
the third-generation quarks is violated [10]. This leads to a multijet final state when the top
quark decays hadronically ($\tilde{g}\tilde{g} \to ttjjjj$). Another example is stealth SUSY, where the
top and the lighter stop ($\tilde{t}_1$) masses are almost degenerate, leading to final states without
significant MET [11]. The heavier stop ($\tilde{t}_2$) has a model-dependent decay branching ratio
to the Z or Higgs boson ($\tilde{t}_2 \to \tilde{t}_1 Z/H$). The hadronic mode of this decay again leads to
multijet final states with little MET ($\tilde{t}_2\tilde{t}_2^{*} \to t\bar{t}(H/Z)(H/Z)$). Our method may also allow
the underlying new particles in such models to be fully reconstructed.

In order to illustrate the strength of our jet tagging method, we investigate a simplified 
version of heavy particle production topology in this paper, i.e. we consider the fully hadronic 
final state of pair-produced vectorlike top partners at the LHC ($pp \to T\bar{T} \to t\bar{t}HH$). In
particular, we study the performance of our taggers at the 14 TeV LHC with a vectorlike
top of mass around 1 TeV as our benchmark scenario. Studies of fully hadronic final states
in similar processes have been based on fat jet substructure [12-15], including experimental
searches at 8 TeV by CMS [16]. Current exclusion bounds on the vectorlike top mass are
$m_T > 700$-$950$ GeV from ATLAS [17-19] and $m_T > 690$-$910$ GeV from CMS [20-22], depending
on the assumed branching fractions.

This paper is arranged as follows: In Section II, we review the essential tools used in
our analysis: the mass-jump clustering and the bucket algorithms. We then apply our
method to the benchmark scenario of the fully hadronic decay of pair-produced vectorlike
tops in Section III. The performance of the involved top and Higgs taggers is investigated
in Section IV. We conclude in Section V.


II. RECAP: MASS-JUMP JET CLUSTERING AND BUCKETS 

In this paper, we investigate a new approach of analyzing high-multiplicity final states 
based on separately resolved jets. We try to answer the two key questions that arise in such 
an analysis: (1) which algorithm to use to construct the jets, and (2) how to reduce the 
sheer combinatorial choices of assigning the jets to the resonance particles of the process. We 
examine the first question by comparing the mass-jump algorithm [6] with the corresponding
jet clustering algorithm of the generalized-$k_T$ family. The latter question is addressed by
the bucket algorithm introduced in Refs. [8, 9]. Both recent techniques are briefly reviewed
in the remainder of this section.


A. Jet clustering with a mass-jump veto 

In the commonly used generalized-$k_T$ family of jet clustering algorithms, jets are constructed
by sequential recombination of input particles until a certain angular distance
is reached, the jet radius $R$. As a result, all jet centres are mutually separated by
$\Delta R(j_1, j_2) = \sqrt{(\phi_1 - \phi_2)^2 + (y_1 - y_2)^2} > R$ and the angular spread of each jet is roughly
$\lesssim R$ ($\phi$ and $y$ are the azimuthal angle and rapidity, respectively). The choice of a large
parameter $R$ can lead to radiation from two (or more) hard partons ending up in the same
jet (splash-in). On the other hand, if the radius is too small, not all final-state radiation of
a hard parton will be captured by the jet (splash-out). In both cases, jet-parton correspondence
is disturbed.
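As a small illustration of the angular distance used throughout, the following Python sketch computes $\Delta R$ between two jets given their azimuthal angles and rapidities. The function name `delta_r` and the argument order are illustrative choices, not taken from any package used in this paper; the only non-trivial step is wrapping the azimuthal difference into $[0, \pi]$.

```python
import math

def delta_r(phi1, y1, phi2, y2):
    """Angular distance sqrt(dphi^2 + dy^2) between two jets.

    phi: azimuthal angle in radians, y: rapidity.
    The azimuthal difference is wrapped into [0, pi].
    """
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(dphi, y1 - y2)
```

With a fixed radius such as $R = 0.4$, two jet axes returned by the clustering satisfy `delta_r(...) > 0.4`, which is the separation property quoted above.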

The mass-jump clustering algorithm [6] addresses this problem by implementing a flexible
jet radius based on an intrinsic jet property (jet mass) as well as the topology of jets in its
vicinity. Two parameters are introduced accordingly, the jet mass threshold $\mu$ and the mass-ratio
parameter $\theta$. Jet clustering starts from a set of input particles, which are labelled active
protojets. A distance metric for protojets $j_i$ is defined as

$$ d_{j_1 j_2} = \frac{\Delta R_{j_1 j_2}^2}{R^2} \min\left[ p_{\perp j_1}^{2n},\, p_{\perp j_2}^{2n} \right] , \qquad d_{j_1 B} = p_{\perp j_1}^{2n} , \tag{1} $$

where $n = 1$ corresponds to the $k_T$ algorithm [23-25], $n = 0$ to the Cambridge/Aachen
algorithm [26, 27], and $n = -1$ to the anti-$k_T$ algorithm [28]. The sequential recombination
algorithm then proceeds as follows [6]:

1. Find the smallest $d_{j_a j_b}$ among active protojets, including $d_{j_a B}$; if it is given by a beam
distance, $d_{j_a B}$, label $j_a$ passive and repeat step 1.

2. Combine $j_a$ and $j_b$ by summing their four-momenta, $p_{j_a + j_b} = p_{j_a} + p_{j_b}$ (E-scheme). If
the new jet is still light, $m_{j_a + j_b} < \mu$, replace $j_a$ and $j_b$ by their combination in the set
of active protojets and go back to step 1.

Otherwise check the mass-jump criterion: If $\theta \cdot m_{j_a + j_b} > \max[m_{j_a}, m_{j_b}]$, label $j_a$ and
$j_b$ passive and go back to step 1.






3. Mass jumps can also appear between an active and a passive protojet. To examine
this:

a. Find the passive protojet $j_n$ which is closest to $j_a$ in terms of the metric $d$ and is
not isolated, $d_{j_a j_n} < d_{j_n B}$.

b. Then check if these two protojets would have been recombined if $j_n$ had not been
rendered passive by a previous veto, i.e. $d_{j_a j_n} < d_{j_a j_b}$.

c. Finally check the mass-jump criterion, $m_{j_a + j_n} \geq \mu$ and $\theta \cdot m_{j_a + j_n} > \max[m_{j_a}, m_{j_n}]$.

If all these criteria for the veto are fulfilled, label $j_a$ passive. Do the same for $j_b$. If
either of $j_a$ or $j_b$ turned passive, go back to step 1.

4. No mass-jump has been found, so replace $j_a$ and $j_b$ by their combination in the set of
active protojets. Go back to step 1.

Clustering terminates when there are no more active protojets left. Passive protojets are
then labelled jets. Note that for $\theta = 0$ or $\mu = \infty$ standard sequential clustering without
veto is recovered, which is reduced to steps 1 and 4. Jet clustering can be $k_T$-like, C/A-like,
or anti-$k_T$-like, depending on the metric chosen [see Eq. (1)].
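The steps above can be sketched in a few lines of Python. This is a minimal, unoptimized illustration written for this text, not the FastJet plugin: four-momenta are plain tuples `(E, px, py, pz)`, all function names are hypothetical, and step 3a is read as "take the closest passive protojet and then require that it is not isolated". The default parameter values are only placeholders.

```python
import math

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))   # E-scheme: sum four-momenta

def pt(p):
    return math.hypot(p[1], p[2])

def mass(p):
    return math.sqrt(max(p[0]**2 - p[1]**2 - p[2]**2 - p[3]**2, 0.0))

def rap(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def d_pair(a, b, R, n):
    """Pairwise distance of Eq. (1)."""
    dphi = abs(math.atan2(a[2], a[1]) - math.atan2(b[2], b[1]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    dr2 = dphi**2 + (rap(a) - rap(b))**2
    return dr2 / R**2 * min(pt(a)**(2 * n), pt(b)**(2 * n))

def d_beam(a, n):
    """Beam distance of Eq. (1)."""
    return pt(a)**(2 * n)

def mass_jump(a, b, mu, theta):
    m = mass(add(a, b))
    return m >= mu and theta * m > max(mass(a), mass(b))

def vetoed_by_passive(x, d_ab, passive, R, mu, theta, n):
    """Step 3 for one active protojet x."""
    if not passive:
        return False
    jn = min(passive, key=lambda p: d_pair(x, p, R, n))      # 3a: closest passive
    d = d_pair(x, jn, R, n)
    return (d < d_beam(jn, n)                                 # 3a: not isolated
            and d < d_ab                                      # 3b
            and mass_jump(x, jn, mu, theta))                  # 3c

def mass_jump_cluster(particles, R=1.0, mu=50.0, theta=0.7, n=0):
    """Return the list of jets (passive protojets) as four-momenta."""
    active, passive = list(particles), []
    while active:
        # Step 1: smallest distance, beam distances included
        cands = [(d_beam(a, n), ia, None) for ia, a in enumerate(active)]
        cands += [(d_pair(active[ia], active[ib], R, n), ia, ib)
                  for ia in range(len(active))
                  for ib in range(ia + 1, len(active))]
        d_ab, ia, ib = min(cands, key=lambda c: c[0])
        if ib is None:
            passive.append(active.pop(ia))
            continue
        a, b = active[ia], active[ib]
        merged = add(a, b)
        drop = []                                   # protojets turned passive
        if mass(merged) >= mu:
            if mass_jump(a, b, mu, theta):          # Step 2: veto on the pair
                drop = [a, b]
            else:                                   # Step 3: veto via passive jets
                drop = [x for x in (a, b)
                        if vetoed_by_passive(x, d_ab, passive, R, mu, theta, n)]
        del active[ib]
        del active[ia]
        if drop:
            passive.extend(drop)
            active.extend(x for x in (a, b) if x not in drop)
        else:
            active.append(merged)                   # light jet, or Step 4
    return passive
```

For two hard, massless prongs separated by $\Delta R = 0.5$ with a combined mass of about 100 GeV, this sketch returns two jets for $\theta = 0.7$, $\mu = 50$ GeV, and a single merged jet for $\theta = 0$, in line with the statement that $\theta = 0$ recovers standard clustering.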

In Ref. [6] it has been shown that the mass-jump clustering algorithm can be useful to
resolve the close-by subjets of boosted top quarks. At the same time, isolated jets are hardly
affected by the veto if $\mu$ and $\theta$ are not chosen too aggressively. These properties qualify the
mass-jump algorithm as a suitable candidate for processes with very busy final states where
this flexibility is essential.

The mass-jump algorithm described above is the first member of the family of jet clustering
algorithms with a terminating veto and has been made publicly available as part of
the FastJet contribution package [29]. The plugin is dubbed ClusteringVetoPlugin and
accepts any user-defined veto function. Example usage is provided within the package.

B. The bucket algorithm 

In high-multiplicity events, the assignment of jets to their respective heavy resonances
can easily get out of hand. There are 6!/(3!3!) = 20 possible combinations to assign six jets
to two top quarks, and for eight jets coming from two tops and a Higgs boson this number
already reaches 8!/(3!3!2!) = 560. The bucket algorithm [8, 9] was proposed in the context
of these two final states and introduces a “bucket” for each top quark (and one additional
bucket of unassigned jets, $B_{\rm ISR}$), into which the jets are allocated. For a bucket $B_i$ the metric

$$ \Delta_{B_i} = | m_{B_i} - m_t | \quad \text{with} \quad m_{B_i}^2 = \Big( \sum_{j \in B_i} p_j \Big)^2 \tag{2} $$

measures the similarity of a collection of jets inside the bucket with a top quark. In Refs. [8, 9],
the combination is determined by minimizing a global $\chi^2$-like metric defined as

$$ \Delta^2 = \omega \Delta_{B_1}^2 + \Delta_{B_2}^2 , \tag{3} $$

and choosing a large $\omega = 100$ effectively decouples the two buckets. Thereby $\Delta_{B_1} < \Delta_{B_2}$
holds and the problem of unfeasible combinatorics is circumvented because the buckets can
be filled independently. Top tagging is performed by imposing cuts on each bucket later on.
Whereas in the original proposal the number of jets inside each bucket is not fixed, in this
paper we require strictly three jets in each top bucket, and also introduce Higgs buckets
that contain exactly two jets.$^1$

We apply the bucket algorithm in conjunction with mass-jump jet clustering as well
as conventional jet clustering for comparison. Our benchmark scenario is given by $t\bar{t}HH$
production from a pair of vectorlike tops. Clearly the naive combinatorics are overwhelming
even for the minimal final-state multiplicity of ten jets: 10!/(3!3!2!2!) = 25200. To tackle
this problem we formally define a global metric

$$ \Delta^2 = \omega_1 \Delta_{B_{t1}}^2 + \omega_2 \Delta_{B_{t2}}^2 + \omega_3 \Delta_{B_{H1}}^2 + \omega_4 \Delta_{B_{H2}}^2 \tag{4} $$

and explicitly decouple the four buckets by choosing the (positive) weights such that

$$ \frac{\omega_{i+1}}{\omega_i} \to 0 , \qquad i = 1, \ldots, 3 . \tag{5} $$


Therefore, the buckets are filled separately in order ($B_{t1}$, $B_{t2}$, $B_{H1}$, $B_{H2}$) and the computational
load is reduced to only 10!/(7!3!) + 7!/(4!3!) + 4!/(2!2!) = 161 comparisons.$^2$ In reality
the number of jets will often be larger than the minimum of ten, where the speedup indicated
here becomes even more prominent. A detailed description of our specific algorithm
is given in the next section. We address possible issues related to the explicit decoupling of
the buckets in Section IV B.

$^1$ In the top quark rest frame, in a large fraction of events, one of the decay products from $t \to bW^+ \to bjj$
carries low transverse momentum and thus fails to be reconstructed as a jet. As this paper is concerned
with boosted top quarks from a heavy resonance decay, this problem does not occur.

$^2$ If the Higgs buckets are filled before the top buckets, the number of combinations is further reduced to
10!/(8!2!) + 8!/(6!2!) + 6!/(3!3!) = 93. However, this would increase the wrong assignments for both the signal and
background. See also the discussion in Section IV B.


III. BENCHMARK SCENARIO: TEN-JET FINAL STATE FROM VECTORLIKE 
TOP PAIR PRODUCTION 


A. Model and event generation 


In this section, we illustrate our approach by using a simple model with a vectorlike top
partner. We extend the SM Lagrangian by adding a vectorlike top $T$ that interacts with the
SM top $t$ and Higgs $H$,$^3$

$$ \mathcal{L} = \mathcal{L}_{\rm SM} + \bar{T} ( i \not{D} - m_T ) T + y_T H \bar{t} T + \text{h.c.} \tag{6} $$

We assume that the vectorlike top decays exclusively to a top and a Higgs. The mass of the
vectorlike top in consideration is $m_T = 0.8$-$1.2$ TeV. The mass of the top is taken to be
173 GeV. The SM Higgs has mass 126 GeV and decays to $b\bar{b}$ with branching ratio 56 %.

MadGraph5_aMC@NLO 2.2.1 [30] is used for generating parton-level events, which
undergo hadronization and showering through PYTHIA 6.426 [21]. DELPHES 3.1.2 [22] with
parameters tuned to the ATLAS detector is used for fast detector simulation.


The relevant SM background processes for our analysis (cf. Sec. III B) and their respective
NLO K-factors are $pp \to t\bar{t}$ (1.61 [32]), $pp \to t\bar{t}b\bar{b}$ (1.77 [23]), $pp \to t\bar{t}H$ (1.10 [35]),
and $pp \to b\bar{b}b\bar{b}$ (1.40 [36]). All final-state top quarks and Higgs bosons are decayed
hadronically within MadGraph5_aMC@NLO. The following generator-level cuts are imposed:
minimum transverse momentum of each outgoing parton $p_\perp > 20$ GeV, angular
separation between outgoing light quarks and between a light quark and a bottom quark
$\Delta R_{jj}, \Delta R_{jb} > 0.2$, and angular separation between a pair of bottom quarks $\Delta R_{bb} > 0.4$. The
latter cut is imposed to guarantee sufficient $b$ separation to employ statistically independent
$b$-quark tagging. A cut on the overall scalar transverse momentum, $H_T^{\text{parton level}} > 1$ TeV, is imposed,
consistent with a similar (but stronger) cut at analysis level, cf. Eq. (8).$^4$ The cut on
$H_T^{\text{parton level}}$ guarantees a reasonably large fraction of events in the signal regions. Note that
this parton-level cut on $H_T$ makes it difficult to generate events at NLO, because it acts
differently on processes with additional jets at matrix element level (the set of partons which
contribute to the sum is different). Matching of matrix elements with additional jets is also
difficult for the same reason. Therefore, we generate background events at LO without
matching to higher multiplicities at matrix element level. Thus, the absolute numbers of
the background events should be taken with a grain of salt. The generated signal events do
not suffer from this approximation.

$^3$ In general, there can also be a model-dependent term $\lambda H \bar{t} \gamma_5 T$ + h.c. in the Lagrangian. Here we assume
$\lambda = 0$ for simplicity.

B. The analysis 

We present an analysis that aims to identify the fully hadronic final state ttHH from 
vectorlike top pair production. We do not rely on large-radius “fat” jets and their substruc¬ 
ture, which has become a standard approach whenever boosted heavy particles are involved. 
Conversely, the approach presented here focuses on separately resolved (small-radius) jets 
and is intended as a proof-of-concept in a realistic and relevant process. 

The proposed analysis consists of the following steps, each of which is described in detail 
in the remainder of this subsection. 

1. Event preselection cuts:

Scalar transverse momentum $H_T > 1400$ GeV and number of $b$-tagged jets $\#b \geq 4$.

2. Jet reconstruction and cut $\#\text{jets} \geq 10$. Here, we use several different benchmark
algorithms including the mass-jump algorithm.

3. Assignment of jets to the four buckets $B_{t1}$, $B_{t2}$, $B_{H1}$, and $B_{H2}$ and cuts.

4. Kinematic reconstruction of the vectorlike tops, depending on the number of identified
top and Higgs buckets.

$^4$ To determine the respective cross-section at large scalar transverse momentum, we cut on events generated
with $H_T^{\text{parton level}} > 500$ GeV to achieve better accuracy. Only for plotting we also generate $t\bar{t}$ and $b\bar{b}b\bar{b}$
events with $H_T^{\text{parton level}} > 1.2$ TeV.
1. Event preselection

The decay cascade of a heavy vectorlike top pair leads to an energy deposit of $H_T \sim
\mathcal{O}(2 m_T)$ in the detector. $H_T$ is the scalar transverse momentum, defined as

$$ H_T = \sum_{\text{jets } j} p_{\perp, j} . \tag{7} $$

We require

$$ H_T > 1400\ \text{GeV} \tag{8} $$

to retain the majority of signal events for a vectorlike top with $m_T \sim 1$ TeV while strongly
suppressing all non-resonant background processes. In the event preselection, jets are clustered
with the anti-$k_T$ algorithm as implemented in FastJet 3.0.6 [37] with parameters
$R = 0.4$ and $p_\perp > 20$ GeV. Note that the jets are reconstructed differently after the preselection.
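The scalar transverse momentum of Eq. (7) and the cut of Eq. (8) amount to a few lines of code. The sketch below assumes jets are given as `(px, py)` pairs in GeV and applies the preselection threshold $p_\perp > 20$ GeV to each jet; the function names and the input format are illustrative.

```python
import math

def scalar_ht(jets, pt_min=20.0):
    """H_T: scalar sum of jet transverse momenta above pt_min (GeV)."""
    pts = (math.hypot(px, py) for px, py in jets)
    return sum(p for p in pts if p > pt_min)

def passes_ht_cut(jets, ht_min=1400.0):
    return scalar_ht(jets) > ht_min

# Toy event: two hard jets and one soft jet below threshold
event = [(300.0, 400.0), (0.0, -1000.0), (10.0, 10.0)]
print(scalar_ht(event), passes_ht_cut(event))
```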

As the signal process contains six $b$ quarks in the final state, we also cut on the number
of bottom tags. Tagging is performed by Delphes using the jets defined above. We select a
conservative working point where 70% of $b$-initiated jets are identified correctly, $\epsilon_{\rm tag} = 0.70$,
and assume the mistag rates of charm-initiated jets to be $\epsilon_{\rm mis}^{c} = 0.10$, and $\epsilon_{\rm mis} = 0.01$ for
light (quark- or gluon-initiated) jets. Cutting on

$$ \#b \geq 4 \tag{9} $$

reduces the relevant backgrounds to $b$-rich processes with high multiplicity, $pp \to t\bar{t}$, $pp \to
t\bar{t}b\bar{b}$, $pp \to b\bar{b}b\bar{b}$, and $pp \to t\bar{t}H$.
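To get a feeling for what the tagging working point implies, one can estimate the probability that a signal event yields at least four $b$-tags. The sketch below is a back-of-the-envelope binomial estimate, not part of the analysis: it assumes that all six $b$ quarks are reconstructed as taggable jets, that tags are statistically independent, and that mistags of other jets are neglected. It also reads the cut as "at least four tags".

```python
from math import comb

def prob_at_least(k_min, n, eff):
    """Binomial probability of at least k_min successes in n independent trials."""
    return sum(comb(n, k) * eff**k * (1.0 - eff)**(n - k)
               for k in range(k_min, n + 1))

# Six b jets, tagging efficiency 0.70, require at least four tags
p_signal = prob_at_least(4, 6, 0.70)
print(round(p_signal, 4))   # 0.7443
```

Under these idealized assumptions roughly three quarters of the signal events survive the cut, which is of the same order as the reduction seen between the first two rows of the cut flow.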

2. Jet reconstruction 

Jets are reconstructed from all calorimeter towers that lie within $|\eta| < 4.9$. To avoid
“chopped” jets at the boundary of the detector, we require $|\eta_{\rm jet}| < 4.0$ so that all jets
are sufficiently central. The key ingredient to this analysis is the choice of jet clustering 
algorithm. In our study, we adopt the following benchmark algorithms and compare them: 

• A C/A-like mass-jump clustering algorithm with parameters

    [MJ06]: ($R = 0.6$, $p_\perp > 25$ GeV, $\theta = 0.7$, $\mu = 50$ GeV),   (10)
    [MJ10]: ($R = 1.0$, $p_\perp > 25$ GeV, $\theta = 0.7$, $\mu = 50$ GeV).   (11)

• A standard setup with the Cambridge-Aachen algorithm and commonly used clustering
parameters,

    [CA03]: ($R = 0.3$, $p_\perp > 25$ GeV),   (12)
    [CA04]: ($R = 0.4$, $p_\perp > 25$ GeV).   (13)


The minimum jet $p_\perp$ is set to be the same to allow for easy comparison of the results. The
additional veto parameters specific to mass-jump clustering, $\theta$ and $\mu$, in Eqs. (10) and (11),
are motivated by the results obtained from boosted top quarks [6]. The mass-jump veto
leads to jets whose effective radius can vary, and it is this inherent flexibility that will lead
to improved results compared to standard jet clustering with fixed angular size.

Because some jets reconstructed with the mass-jump algorithm may experience a very
large effective radius,$^5$ contamination from pile-up and underlying event can pose problems
in a realistic environment. We therefore apply a trimming [38] stage. For each jet $j$, the
constituents are re-clustered with a smaller radius $R_{\rm trim}$, yielding hard and possibly also soft
subjets. The jet is then re-built only from the subjets $i$ that are hard enough,

$$ p_{\perp, i} > f_{\rm trim}\, p_{\perp, j} . \tag{14} $$

We choose $R_{\rm trim} = 0.2$ and $f_{\rm trim} = 0.03$ as suggested in Ref. [38]. Trimming is applied to all
the benchmark points, MJ06, MJ10, CA03, and CA04.
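The selection step of trimming, Eq. (14), can be illustrated as follows. The sketch assumes that the re-clustering into subjets of radius $R_{\rm trim}$ has already been done elsewhere and only applies the hardness criterion; the function name and the representation of subjets by their transverse momenta are illustrative simplifications.

```python
def trim_subjets(subjet_pts, jet_pt, f_trim=0.03):
    """Keep the subjets with pT_i > f_trim * pT_jet (all values in GeV)."""
    threshold = f_trim * jet_pt
    return [p for p in subjet_pts if p > threshold]

# A 300 GeV jet with two hard and two soft subjets: threshold is 9 GeV
kept = trim_subjets([250.0, 40.0, 5.0, 2.0], jet_pt=300.0)
print(kept)   # [250.0, 40.0]
```

In a full implementation the trimmed jet would then be rebuilt from the four-momenta of the kept subjets.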

After the jets are reconstructed, we require

$$ \#\text{jets} \geq 10 , \tag{15} $$

three for each top quark and two for each Higgs boson.


3. Bucket construction and tagging 

In order to keep the combinatorial choices of this multi-jet process at a manageable level,
we make use of the idea of buckets [8, 9]. First of all, the first top bucket $B_{t1}$ is filled with
the three jets that minimize

$$ \Delta = | m_{\rm bucket} - m_t | . \tag{16} $$

5 This effect will be investigated later, cf. Fig. Ml 
Here and for all other buckets, we limit the allowed jet combinations to those which fulfill

$$ p_{\perp, \text{bucket}} > 200\ \text{GeV} . \tag{17} $$

This prevents wrong assignments from widely separated low-energy jets, which are possible
due to the sheer number of possible choices. In addition, only combinations with minimum
mutual jet separation

$$ \Delta R(j, j) > 0.3 \tag{18} $$

are considered, because smaller distances cannot reasonably be resolved by the hadronic
calorimeter any more. This cut is also consistent with the cuts applied at generator level,
cf. Sec. III A. Note that we do not impose an upper cut on the angular spread of the top decay
products, as is implicitly done in all substructure methods which rely on a fat jet of fixed
radius. Also note that Eq. (18) does not restrict the analysis if all jets are mutually separated
by more than $\Delta R(j, j) = 0.3$, i.e. the fixed-$R$ setups CA03 and CA04 are unaffected. If two
top subjets are very close-by and merge in the CA03 setup, then even in the ideal case that the
MJ algorithm can resolve them separately, they cannot contribute to the same bucket. In
this sense the cut helps to allow a fair comparison between the mass-jump setups and the
Cambridge-Aachen setups.

After the first top bucket has been fixed, out of the remaining jets the second top bucket
$B_{t2}$ is filled with three jets, then the first Higgs bucket $B_{H1}$ with two jets, and finally $B_{H2}$
again with two jets. This course of action corresponds to a global metric with explicitly
decoupled buckets as defined in Eqs. (4) and (5). Again for each bucket, out of all possible
jet combinations that fulfil Eqs. (17) and (18), the combination with minimum metric
(Eq. (16), where $m_t$ is replaced by $m_H$ for Higgs buckets) is selected. Events where there is
no viable jet combination for a bucket are negligibly rare. Remaining jets are assigned to a
fifth bucket $B_{\rm ISR}$ and not further considered in our analysis.
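The sequential filling described above can be sketched in Python as follows. This is an illustrative reimplementation under simplifying assumptions, not the analysis code: jets are four-momentum tuples `(E, px, py, pz)` in GeV, all function names are hypothetical, and ties between equally good combinations are broken by iteration order.

```python
import math
from itertools import combinations

M_TOP, M_HIGGS = 173.0, 126.0   # GeV, masses used in the text

def p4_sum(jets):
    return tuple(sum(c) for c in zip(*jets))

def mass(p):
    return math.sqrt(max(p[0]**2 - p[1]**2 - p[2]**2 - p[3]**2, 0.0))

def pt(p):
    return math.hypot(p[1], p[2])

def delta_r(a, b):
    ya = 0.5 * math.log((a[0] + a[3]) / (a[0] - a[3]))
    yb = 0.5 * math.log((b[0] + b[3]) / (b[0] - b[3]))
    dphi = abs(math.atan2(a[2], a[1]) - math.atan2(b[2], b[1]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(dphi, ya - yb)

def fill_bucket(jets, size, target_mass, pt_min=200.0, dr_min=0.3):
    """Return (Delta, indices) of the best combination, or None if none is viable."""
    best = None
    for combo in combinations(range(len(jets)), size):
        members = [jets[i] for i in combo]
        # Eq. (18): all jets in the bucket must be mutually separated
        if any(delta_r(a, b) <= dr_min for a, b in combinations(members, 2)):
            continue
        total = p4_sum(members)
        # Eq. (17): minimum transverse momentum of the bucket
        if pt(total) <= pt_min:
            continue
        delta = abs(mass(total) - target_mass)      # Eq. (16)
        if best is None or delta < best[0]:
            best = (delta, combo)
    return best

def fill_all_buckets(jets):
    """Fill t1, t2, H1, H2 in this order; leftover jets go to the ISR bucket."""
    remaining = list(range(len(jets)))
    buckets = {}
    for name, size, target in (("t1", 3, M_TOP), ("t2", 3, M_TOP),
                               ("H1", 2, M_HIGGS), ("H2", 2, M_HIGGS)):
        found = fill_bucket([jets[i] for i in remaining], size, target)
        if found is None:
            return None
        chosen = [remaining[i] for i in found[1]]
        buckets[name] = chosen
        remaining = [i for i in remaining if i not in chosen]
    buckets["ISR"] = remaining
    return buckets
```

The dictionary returned by `fill_all_buckets` maps each bucket name to the indices of its jets, so that the tagging cuts can be applied afterwards, once all buckets have been filled.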

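Only after all buckets have been filled, cuts are applied. For top candidate buckets, we
require

$$ \Delta < 25\ \text{GeV} , \tag{19} $$

$$ \left( \frac{m_W}{m_t} \right)_{\rm bucket} = \frac{m_W}{m_t} \pm 15\% , \tag{20} $$

$$ \left( \frac{m_{23}}{m_t} \right)_{\rm bucket} > 0.35 . \tag{21} $$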
                        SR1    SR2    SR3
number of tagged t      = 1    = 2    = 2
number of tagged H      = 2    = 1    = 2

TABLE I: The three signal regions.


Eq. (19) is a simple cut on the reconstructed top mass. The mass ratio on the left-hand
side of Eq. (20) is constructed from the two jets which best reconstruct the W boson mass
($m_{W,\text{bucket}}$) and the total jet mass of the bucket ($m_{t,\text{bucket}}$), as proposed when the bucket
algorithm was introduced in Ref. [8]. The final cut in Eq. (21) was introduced in the
HEPTopTagger [7], where $m_{23}$ is the combined mass of the two sub-leading jets in the
bucket (in terms of $p_\perp$). In our study, it helps to suppress top candidates whose momentum
is dominated by one very hard prong.

Higgs candidate buckets have to fulfill

$$ \Delta < 20\ \text{GeV} . \tag{22} $$

The 4-momentum of the successful top or Higgs candidate is given by the momentum sum
of the jets inside the bucket.
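The tagging cuts of Eqs. (19)-(22) can be written as two small predicate functions. The sketch below is an illustration under stated assumptions: jets are four-momentum tuples `(E, px, py, pz)` in GeV, the W mass value `M_W = 80.4` GeV is not quoted in the text and is inserted here as an assumption, and the "± 15%" window of Eq. (20) is interpreted as a relative deviation from $m_W/m_t$. All names are hypothetical.

```python
import math
from itertools import combinations

M_TOP, M_HIGGS = 173.0, 126.0   # GeV, from the text
M_W = 80.4                      # GeV, assumed value (not given in the text)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def mass(p):
    return math.sqrt(max(p[0]**2 - p[1]**2 - p[2]**2 - p[3]**2, 0.0))

def pt(p):
    return math.hypot(p[1], p[2])

def top_bucket_tagged(jets):
    """Apply Eqs. (19)-(21) to a bucket of exactly three jets."""
    j1, j2, j3 = jets
    m_bucket = mass(add(add(j1, j2), j3))
    if abs(m_bucket - M_TOP) >= 25.0:                         # Eq. (19)
        return False
    # W candidate: the jet pair whose mass is closest to M_W
    m_w = min((mass(add(a, b)) for a, b in combinations(jets, 2)),
              key=lambda m: abs(m - M_W))
    target = M_W / M_TOP
    if abs(m_w / m_bucket - target) > 0.15 * target:          # Eq. (20), relative window
        return False
    ordered = sorted(jets, key=pt, reverse=True)
    m23 = mass(add(ordered[1], ordered[2]))                   # two sub-leading jets
    return m23 / m_bucket > 0.35                              # Eq. (21)

def higgs_bucket_tagged(jets):
    """Apply Eq. (22) to a bucket of exactly two jets."""
    return abs(mass(add(jets[0], jets[1])) - M_HIGGS) < 20.0
```

Because the cuts are applied only after all four buckets have been filled, these predicates act on fixed jet assignments and do not influence the assignment itself.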


4. Signal regions and kinematic reconstruction

We define three signal regions depending on the number of tagged buckets, see Tab. I. In
addition to event rates, we kinematically reconstruct the vectorlike top from the momenta
of a tagged top quark and a Higgs boson to assess its invariant mass

$$ M(t, H) = \sqrt{(p_t + p_H)^2} . \tag{23} $$

In the case of a fully reconstructed event (SR3), we choose between the two possible pairings,
$\{(t_1, H_1), (t_2, H_2)\}$ and $\{(t_1, H_2), (t_2, H_1)\}$, such that the mass difference of the two vectorlike
tops is minimal,

$$ \min\left[ \, |M(t_1, H_1) - M(t_2, H_2)| , \; |M(t_1, H_2) - M(t_2, H_1)| \, \right] . \tag{24} $$

The majority of events, however, fall into signal regions 1 and 2, where we are left with
three tagged buckets and one untagged bucket. As the untagged bucket also contains a significant
energy deposit, its momentum can be used as an estimate for the fourth particle. Again we
apply Eq. (24) to determine the correct pairing. Only the vectorlike top that is reconstructed
from two tagged buckets is further considered.
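The pairing choice of Eq. (24) is easy to express in code. In the following sketch the four candidates are four-momentum tuples `(E, px, py, pz)` in GeV; the function name and the returned labels `"straight"` and `"crossed"` are illustrative. For the signal regions with one untagged bucket, the momentum of the untagged bucket would simply be passed in place of the missing candidate.

```python
import math

def inv_mass(a, b):
    """Invariant mass M = sqrt((p_a + p_b)^2), Eq. (23)."""
    e, px, py, pz = (x + y for x, y in zip(a, b))
    return math.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))

def pair_top_higgs(t1, t2, h1, h2):
    """Pick the pairing with the smaller mass difference, Eq. (24).

    Returns (label, (M_first, M_second)).
    """
    straight = (inv_mass(t1, h1), inv_mass(t2, h2))
    crossed = (inv_mass(t1, h2), inv_mass(t2, h1))
    if abs(straight[0] - straight[1]) <= abs(crossed[0] - crossed[1]):
        return "straight", straight
    return "crossed", crossed

# Toy four-momenta (not physical top/Higgs kinematics, for illustration only)
t1, h1 = (600.0, 500.0, 0.0, 0.0), (450.0, -400.0, 0.0, 0.0)
t2, h2 = (550.0, 0.0, 500.0, 0.0), (500.0, 0.0, -450.0, 0.0)
print(pair_top_higgs(t1, t2, h1, h2))
```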


C. Results


The cut flow and expected event numbers at the LHC14 with 100 fb$^{-1}$ integrated luminosity
are shown in Tab. II for two benchmark setups MJ10 [Eq. (11)] and CA03 [Eq. (12)].
For both setups, the $T\bar{T}$ signal outnumbers the SM background up to $m_T = 800$-$900$ GeV
in the signal region SR2, and up to $m_T = 1.1$ TeV in the signal regions SR1 and SR3. Note
that the absolute numbers should be taken with great care due to the simplified event generation
setup, cf. Sec. III A. A comparison of the relative significances between the employed
algorithms is less affected by the uncertainties, though.

We observe that event numbers are largest in SR2 (2 tagged top quarks, 1 tagged Higgs
boson). This is particularly pronounced for the various background processes. It can be
understood by the order in which the four buckets are filled: The top buckets are filled
first and reconstruct the truth partons very well, as will be investigated in Sec. IV. If jets
originating from a Higgs boson are wrongly assigned to a top bucket, it becomes unlikely to
fill both Higgs buckets from the remaining jets with masses within the mass window. This
effect is larger for background processes, among which only a vanishing fraction contains
actual Higgs bosons at parton level except the $t\bar{t}H$ background. Thus the Higgs buckets are
dominantly filled from the remaining unrelated jets.

As can be seen in Tab. II, in the conventional clustering setup CA03, event numbers
are considerably smaller than those obtained with mass-jump clustering MJ10, both for the
signal and SM backgrounds. This is already observable at the $\#\text{jets} \geq 10$ cut stage, and
the difference becomes even larger when events in the final signal regions are compared. 
Due to the fixed jet radius of CA03, hard prongs that are separated by a distance smaller 
than R = 0.3 merge, and it is easily understood that the number of hard jets is naturally 
smaller than the one obtained from a (reasonable) mass-jump setup. As our implementation 
of the bucket algorithm explicitly requires resolved constituent jets, those merged jets fail 
to reconstruct their hard resonance, leading to a large drop in event numbers in all signal 
regions. 
Number of events for 100 fb$^{-1}$

                        $T\bar{T}$ signal, vectorlike top mass               background
Setup  Cut / region     800 GeV  900 GeV  1.0 TeV  1.1 TeV  1.2 TeV    b.g.   $b\bar{b}b\bar{b}$   $t\bar{t}$   $t\bar{t}b\bar{b}$   $t\bar{t}H$

       $H_T$ > 1.4 TeV    507      306      167      86.7     43.6     25600   4130    20600    772     52.9
       #b ≥ 4            356      217      118      60.8     30.6      1730    990      506    218     16.4

MJ10   #jets ≥ 10        306      185      101      52.7     26.8       518    166      201    141      9.5
MJ10   SR1               14.9     10.4      6.4      3.8      1.9       3.3    0.5      1.1    1.5      0.1
MJ10   SR2               36.5     20.7     11.5      5.7      2.7      22.2    1.8     11.4    8.1      0.9
MJ10   SR3               10.5      6.0      3.9      2.2      1.0       2.1    0.1      1.1    0.8      0.1

CA03   #jets ≥ 10        282      172      90.6     45.6     22.9       392    121      145    118      7.7
CA03   SR1                8.4      8.0      4.7      2.4      1.2       2.4    0.4      1.0    0.9      0.1
CA03   SR2               24.4     14.1      6.8      3.6      1.8      11.1    0.8      5.3    4.5      0.5
CA03   SR3                5.6      2.9      1.8      1.1      0.4       0.8    0.0      0.4    0.4      0.1

TABLE II: Expected event numbers for two benchmark setups, mass-jump clustering MJ10
[Eq. (11)] and standard Cambridge-Aachen clustering CA03 [Eq. (12)]. Numbers are given for
an integrated luminosity of 100 fb$^{-1}$ at the LHC with $\sqrt{s} = 14$ TeV. Results for the signal are
shown separately for different values of the vectorlike top mass ranging from 800 GeV to 1.2 TeV.
All relevant background processes as well as their sum (“b.g.”) are given in the right-hand columns.
The three signal regions (SR) are defined in Table I.


In Fig. 2 we show the distributions of the vectorlike top mass, where stacked histograms
of all three signal regions SR1 - SR3 are presented. (In SR3, each event gives two entries.)
The kinematic reconstruction of the vectorlike top works very well, as manifest in a clear
peak in the figures.

In order to compare the different jet clustering setups, it is instructive to look at signal
significance $S/\sqrt{B}$, which we take from the number of signal events $S$ and number of background
events $B$ summed over all three signal regions. Numbers are given in Tab. III for
all considered setups. It is observed that among the standard clustering setups CA03 and
CA04, the smaller jet radius yields better results. The reason is that nearby prongs can only
FIG. 2: Reconstructed mass of the vectorlike top $M(t, H)$ (for truth $m_T = 1$ TeV and 1.1 TeV) with
the MJ10 setup for SR1 (top left), SR2 (top right) and SR3 (bottom), for an integrated luminosity
100 fb$^{-1}$. The histograms are stacked.

be separately resolved if the radius parameter is smaller than the mutual separation. For 
mass-jump clustering MJ10 and MJ06, the opposite is true: the larger maximum jet radius 
gives more significant results. Even in very busy final states some prongs end up fairly iso¬ 
lated, and they are more accurately reconstructed with larger jets. Overall the mass-jump 
algorithm outperforms the fixed-radius conventional clustering. 
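
As a cross-check of how the significance is formed, the CA03 entries of Table II can be summed over the three signal regions and inserted into $S/\sqrt{B}$. The short sketch below does this in Python. It assumes that the sixth number in each row of Table II is the background sum "b.g." (this is consistent with the four numbers that follow it adding up to that value); the variable and function names are illustrative only.

```python
import math

# Event numbers copied from the CA03 rows of Table II (100 fb^-1, 14 TeV).
# Keys: vectorlike top mass in GeV. Values: signal events in (SR1, SR2, SR3).
signal_ca03 = {
    800: (8.4, 24.4, 5.6),
    900: (8.0, 14.1, 2.9),
    1000: (4.7, 6.8, 1.8),
    1100: (2.4, 3.6, 1.1),
    1200: (1.2, 1.8, 0.4),
}

# Assumption: the sixth column of each row is the summed background "b.g.".
background_ca03 = (2.4, 11.1, 0.8)  # (SR1, SR2, SR3)


def significance(signal_per_sr, background_per_sr):
    """Return S / sqrt(B) with S and B summed over all signal regions."""
    s = sum(signal_per_sr)
    b = sum(background_per_sr)
    return s / math.sqrt(b)


result = {mass: significance(s, background_ca03) for mass, s in signal_ca03.items()}

for mass, value in result.items():
    print(f"m_T = {mass} GeV: S/sqrt(B) = {value:.2f}")
```

The values obtained this way agree with the CA03 row of Table III to within a few hundredths, which is the level expected from inputs rounded to one decimal place.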

The reconstructed signal mass (for truth $m_T = 1$ TeV) is shown in Fig. 3 for all setups.
A peak is visible for all jet clustering setups, but for the fixed-radius algorithms CA03 and
CA04 it is shifted to lower values in SR1 and SR2. The reconstruction is worst for the
CA03 setup. Only the analysis based on the mass-jump clustering reproduces the mass
of the heavy $T$ in all signal regions. Independent of the specific clustering algorithm, the
reconstructed mass peak has an edge around the true mass, with the majority of events
lying at lower values. This may be due to the fact that we do not explicitly include
the leading gluon emission when the buckets are reconstructed.

$S/\sqrt{B}$    800 GeV    900 GeV    1.0 TeV    1.1 TeV    1.2 TeV
MJ10            11.77      7.05       4.13       2.22       1.07
MJ06            11.38      6.96       4.06       2.16       1.02
CA03            10.17      6.63       3.49       1.86       0.90
CA04            11.06      5.91       3.36       1.51       0.61

TABLE III: Comparison of significance $S/\sqrt{B}$ (number of signal events $S$ and number of
background events $B$ summed over all three signal regions) for different jet algorithms and benchmark
setups.

FIG. 3: Reconstructed $T$ mass (for truth $m_T = 1$ TeV) in SR1 (top left), SR2 (top right) and SR3
(bottom) for different jet algorithms, for an integrated luminosity of 100 fb$^{-1}$.

Possible explanations for these observations and a comparison between standard
Cambridge-Aachen and mass-jump jets are given in the following subsection.
D. Comparison of jet clustering algorithms




FIG. 4: $\Delta R = \sqrt{d_{ij}}$ of the last recombination step in the hardest (left) and tenth-hardest jet (right)
in signal events with $m_T = 1$ TeV (arbitrary units). The solid lines depict values for jets clustered
with the C/A-like mass-jump algorithm (MJ10), whereas jets clustered with the conventional C/A
algorithm (CA03) are given by dashed lines.


The results found in the previous subsection have mixed implications for the ideal jet
radius when standard fixed-$R$ clustering is employed. CA03 yields larger overall significance
than CA04, cf. Tab. III. This is not surprising, as only a small radius can separately resolve
hard prongs from boosted top and Higgs decays. In terms of event numbers, this advantage
seems to compensate well for possible splash-out, i.e. the loss of final-state radiation that falls
outside the cone. On the other hand, Fig. 3 shows that the reconstruction of the vectorlike
top mass works better with a larger radius. This illustrates the difficulty of finding an optimal
radius $R$ in fixed-$R$ clustering algorithms.

Instead of employing a fixed clustering radius, the mass-jump algorithm was designed to
separately resolve hard prongs at any distance scale if the terminating veto is called. Fig. 4
shows the angular distance $\Delta R = \sqrt{d_{ij}}$ of the last recombination step in the hardest (left)
and tenth-hardest jet (right). Whereas for CA03 jets (dashed lines) the $\Delta R$ distribution
peaks at the radius cut $R = 0.3$ or slightly below, MJ10 jets (solid lines) show much more
variety. For the hardest jet, $\Delta R$ peaks at around 0.25 but can also take large values,
and for the tenth-hardest jet it has a broader and almost flat distribution. This inherent
flexibility constitutes the key to reconstructing the busy final state considered here.

The tail up to very large values of $\Delta R$ seen for the tenth-hardest (soft) mass-jump jets (Fig. 4,
right-hand side) may be a relic of the algorithm. In terms of significance, it was nevertheless observed
that the overall performance is improved when such large radii are allowed.$^6$ These large-area
jets can gather additional soft radiation, e.g. soft gluon emissions, which can lead to a
more accurate bucket mass. The fixed-$R$ setups CA03 and CA04 do not have these features,
which could explain why the reconstructed $T$ mass in SR1 and SR2 is shifted to lower values
for these algorithms. We speculate that a dedicated study of this effect may lead to improved
taggers in this context, but it is beyond the scope of this paper.



[Figure 5: two panels; horizontal axis $m_{\rm jet}$ [GeV], ranging from 0 to 100.]

FIG. 5: Trimmed jet mass of the hardest (left) and tenth-hardest jet (right) in signal events with
$m_T = 1$ TeV (arbitrary units). The solid lines depict values for jets clustered with the C/A-like
mass-jump algorithm (MJ10), whereas jets clustered with the conventional C/A algorithm (CA03)
are given by dashed lines.


$^6$ This improvement is diminished by trimming. By including a trimming stage, we assume that our
results are not affected much if additional soft radiation from underlying event and pile-up is taken into
account. These effects should be included in a realistic study, but pile-up can only be reliably simulated
by the experimental collaborations. We assume that our results, in particular the comparison between
conventional jet clustering and mass-jump clustering, are still qualitatively valid in our simplified setup.

Fig. 5 shows the trimmed jet mass, again for the hardest (left) and tenth-hardest jet
(right). A fraction of events has a very heavy leading jet with $m_j \simeq 70$ to $80$ GeV
in the CA03 setup, indicating that nearby hard prongs have merged. The leading mass-jump
jet, on the other hand, has a cutoff at $m_j = \mu = 50$ GeV due to the veto condition
(cf. Sec. II A), and very large jet masses are absent. As the plain jet mass roughly scales with
$p_\perp \cdot R$, soft jets clustered with fixed-$R$ algorithms tend to be very light, as shown in the
right panel of Fig. 5 for the CA03 setup. However, final-state radiation of low-$p_\perp$ jets is less
collimated and ideally caught in jets with larger radius [39]. In the same figure, it is seen
that the tenth-hardest (soft) MJ10 jets are heavier due to the larger effective jet radius observed
in Fig. 4.
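
The rough scaling of the plain jet mass with $p_\perp \cdot R$ used in this argument can be motivated by a short estimate. The sketch below assumes a jet built from only two massless prongs at small angular separation $\Delta R$, carrying transverse momentum fractions $z$ and $1-z$ of the jet $p_\perp$; the symbols $z$, $p_{\perp,1}$ and $p_{\perp,2}$ are introduced here for illustration only.

```latex
\begin{align*}
  m_j^2 &= 2\, p_{\perp,1}\, p_{\perp,2}
           \left( \cosh\Delta y - \cos\Delta\phi \right)
         \simeq p_{\perp,1}\, p_{\perp,2}\, \Delta R^2 ,
  \qquad \Delta R^2 = \Delta y^2 + \Delta\phi^2 , \\
  m_j   &\simeq \sqrt{z(1-z)}\; p_\perp\, \Delta R
         \;\le\; \tfrac{1}{2}\, p_\perp\, \Delta R ,
  \qquad p_{\perp,1} = z\, p_\perp , \quad p_{\perp,2} = (1-z)\, p_\perp .
\end{align*}
```

In a fixed-$R$ Cambridge-Aachen clustering the last recombination satisfies $\Delta R \le R$, so in this two-prong picture a soft jet with, say, $p_\perp = 50$ GeV and $R = 0.3$ (illustrative numbers) cannot exceed $m_j \approx 7.5$ GeV, whereas a larger effective radius admits a correspondingly larger mass.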


We conclude that the results found in Sec. III C, namely that a small jet radius can be
advantageous for conventional fixed-$R$ clustering algorithms, whereas mass-jump clustering
benefits from a very large maximum $R$, can be explained by looking at jet merging scales
and mass distributions. For our process, which includes four boosted resonances and a very busy
final state, it is essential to find jets with a flexible algorithm. The mass-jump algorithm
avoids the problem of searching for a good compromise for the fixed jet radius parameter and
leads to physically more appealing jets. Consequently, it generally outperforms its standard
fixed-$R$ counterpart, the Cambridge-Aachen algorithm, in the phenomenological analysis.


E. On fat jet contamination 

In this subsection, we briefly discuss the expected performance of algorithms using fat jets
in the present process, $pp \to T\bar{T} \to t\bar{t}HH \to 10$ jets. In Sec. I we argued that the fat jets are
not well-separated in such a busy hadronic final state, and this problem is illustrated in Fig. 1
for the signal process with vectorlike top mass $m_T = 1$ TeV. The figure shows the angular
distributions of the ten partonic (anti)quark final-state daughters (Monte Carlo truth). The
black solid (dashed) line shows the distribution of the largest angular distance between the
truth daughters of one top quark (Higgs boson) found in an event. The smallest distance
between any truth daughters not coming from the same mother resonance is given by the
red line. It is observed that the distance between the nearest daughter particles coming from
different mother resonances, $\Delta R({\rm NN})$, is typically smaller than the angular spread of a $t$ or
$H$ decay, $\Delta R_{\max}(t/H)$. As a result, the fat jets will be contaminated.
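
To make the two quantities compared here concrete, the following Python sketch computes, for a single event, the largest angular spread among the daughters of each mother resonance and the smallest distance between daughters of different mothers. It is only an illustration: the input format, the function names and the toy coordinates are assumptions, and the distance is taken in the $(\eta, \phi)$ plane.

```python
import math
from itertools import combinations


def delta_r(a, b):
    """Angular distance between two (eta, phi) points, with phi wrapped."""
    d_eta = a[0] - b[0]
    d_phi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(d_eta, d_phi)


def angular_summary(daughters):
    """daughters: list of (mother_label, eta, phi) truth partons of one event.

    Returns (spread, nearest), where spread maps each mother to the largest
    distance between two of its daughters, and nearest is the smallest
    distance between two daughters of different mothers.
    """
    spread = {}
    nearest = math.inf
    for (m1, eta1, phi1), (m2, eta2, phi2) in combinations(daughters, 2):
        dr = delta_r((eta1, phi1), (eta2, phi2))
        if m1 == m2:
            spread[m1] = max(spread.get(m1, 0.0), dr)
        else:
            nearest = min(nearest, dr)
    return spread, nearest


# Toy event (hypothetical coordinates): one top with three daughters and
# one Higgs boson with two daughters.
event = [
    ("t1", 0.0, 0.0), ("t1", 0.4, 0.0), ("t1", 0.0, 0.3),
    ("H1", 0.5, 0.1), ("H1", 1.2, 0.1),
]
spread, nearest = angular_summary(event)
print(spread, nearest)
```

In this toy event the nearest daughters from different mothers are closer to each other than the spread of either decay, which is the situation in which a fat jet wide enough to contain one resonance also picks up a daughter of another.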

To be more specific, we take the default fat jet clustering parameters of the widely used
HEPTopTagger [7],

\[ \text{Cambridge-Aachen:}\quad R_{\text{fat jet}} = 1.5 \quad\text{and}\quad p_\perp^{\text{fat jet}} > 200~\text{GeV} , \tag{25} \]

and give some concrete results for the process $pp \to T\bar{T} \to t\bar{t}HH \to 10$ jets ($m_T = 1$ TeV)
in Fig. 6 (CA15, upper panels). A different choice of parameters is suggested in comparisons
between boosted top tagging algorithms [Tj,

\[ \text{anti-}k_T\text{:}\quad R_{\text{fat jet}} = 1.0 \quad\text{and}\quad p_\perp^{\text{fat jet}} > 200~\text{GeV} , \tag{26} \]

and we also give the plots for these fat jets in Fig. 6 (AKT10, lower panels).

The upper panels imply that the clustering radius of the CA15 jets is too large in this
situation. The upper left panel shows the distribution of the number of fat jets. In more
than 50% of events, only three fat jets are found, and even fewer in another 20%.
A fat jet is labelled "pure truth" $t$ ($H$) if all truth daughter partons of one top
quark (or Higgs boson) and no other truth daughter partons are ghost-associated [40].$^7$ In
each bin, the fraction of pure truth $t$ ($H$) fat jets is represented by the hatched (black)
area. One can see that only a relatively small fraction of the fat jets are pure truth ones. A
phenomenological study would have to rely on events with only three fat jets, and the plain
jet mass of the leading fat jet in those events is depicted in the upper right panel. There is a long tail
towards very large jet masses, which suggests that there is a significant amount of splash-in
from jets not coming from the same $t$ or $H$ resonance.
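
The "pure truth" criterion can be written down compactly once the ghost association has been performed. The Python sketch below assumes that each fat jet is already described by the set of truth daughter partons associated with it; the identifiers, the data layout and the toy event are hypothetical and serve only to illustrate the definition.

```python
def label_fat_jet(associated, daughters_of):
    """Return the resonance a fat jet is 'pure truth' for, or None.

    associated:   set of truth daughter parton ids ghost-associated to the jet.
    daughters_of: dict mapping a resonance name to the set of its truth
                  daughter parton ids.
    A jet is pure if it contains all daughters of exactly one resonance and
    no other truth daughter, i.e. if the two sets are equal.
    """
    for resonance, daughters in daughters_of.items():
        if associated == daughters:
            return resonance
    return None


def pure_fraction(fat_jets, daughters_of):
    """Fraction of fat jets in an event that are pure truth jets."""
    labels = [label_fat_jet(jet, daughters_of) for jet in fat_jets]
    return sum(label is not None for label in labels) / len(labels)


# Toy event (hypothetical): ten truth daughters from two tops and two Higgs bosons.
daughters_of = {"t1": {0, 1, 2}, "t2": {3, 4, 5}, "H1": {6, 7}, "H2": {8, 9}}

# Three fat jets: one clean top, one top with splash-in from H1,
# and one jet that merges the rest of H1 with all of H2.
fat_jets = [{0, 1, 2}, {3, 4, 5, 6}, {7, 8, 9}]

labels = [label_fat_jet(jet, daughters_of) for jet in fat_jets]
fraction = pure_fraction(fat_jets, daughters_of)
print(labels, fraction)
```

Only the first jet of the toy event qualifies; the other two illustrate the splash-in discussed above, where a daughter of a neighbouring resonance ends up inside the fat jet.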

The second setup (AKT10) with smaller-radius fat jets, shown in the lower row of
Fig. 6, behaves better in this respect. In roughly 40% of events the correct number of four
fat jets is identified (lower left), although three-jet events are still dominant. For the events
with three fat jets, the leading jet mass is shown in the lower right panel. It can be seen that
the distribution still shows a tail, but large jet masses are now much rarer than for
CA15 jets, implying less contamination through splash-in. On the other hand, when we
compare the fraction of pure jets in events with four fat jets between the two setups, we
observe that AKT10 jets behave worse: while almost 75% of the respective CA15 fat jets
are pure, this number drops to 65% for AKT10.

We conclude that the study of this process is difficult if we rely on fat jets.$^8$ The problem
of insufficient separation of the boosted resonances ($t$ and $H$) is not generically avoided even
if a different fat jet radius is chosen: the smallest distance between truth daughters from
different mothers is typically smaller than the angular spread of the top quark and Higgs
boson, as shown in Fig. 1. There is no apparent solution to this contamination within fat
jet algorithms, and the choice of clustering algorithm and parameters amounts to finding
a balance between splash-in (the fat jet also contains energy deposits from a different resonance)
and splash-out (the fat jet does not contain all radiation from a given resonance). As
demonstrated in the main part of this paper, this problem is reduced when the mass-jump
algorithm is used.

$^7$ The truth partons' momenta are rescaled to infinitesimal $p_\perp$ and energy while $\eta$ and $\phi$ are kept fixed
("ghosts"), and they participate in jet clustering. Those partons that end up as constituents of a certain jet
are called ghost-associated. Due to the vanishing energy of the ghosts, the final jets are unaffected.

$^8$ For $m_T < 900$ GeV, an analysis based on fat jets can still reconstruct the vectorlike top m- An experimental
analysis of the same process also relies on fat jets [15].

FIG. 6: The upper row shows the results for CA15 fat jets: The number of fat jets is given in the
left plot. A fat jet is labelled pure truth $t$ ($H$) if all truth daughter partons of one top quark (Higgs
boson) and no other truth daughter partons are ghost-associated. For events in which three fat
jets were found, the distribution of the plain jet mass of the leading fat jet is shown in the right
plot. Lower row: The same plots for AKT10 fat jets.


IV. PERFORMANCE OF TOP AND HIGGS TAGGING 

In this section, we investigate the performance of top and Higgs tagging in our approach
with the mass-jump (and the Cambridge-Aachen) clustering algorithms. We also briefly
comment on the metric of the decoupled buckets in Sec. IV B.


It should be emphasized that the tagging efficiencies and the quality of reconstruction 
strongly depend on the considered physical processes as well as the event generator by which 
the samples have been produced. This is even more true in our study, as the performance 
of top/Higgs tagging is affected by the hadronic activity from other top and Higgs decay 
products in the candidate’s vicinity. Tagging efficiencies and reconstruction qualities of the 
present canonical tagging algorithms for boosted resonances are usually evaluated for isolated 
fat jets (see e.g. Refs. HI El), which reduces the dependence on the specific process and makes 
it possible to compare the results between different algorithms. This condition is not satisfied 
in our benchmark analysis and therefore the results for top and Higgs tagging can hardly be 
related to other algorithms. In addition, the strong weighting of the global buckets metric 
in Eq. (J4]) naturally leads to the first top bucket being much better reconstructed than the
second. Similarly, the reconstruction quality of the top quarks is generally better than that 
of the Higgs bosons. 

Despite these caveats, the results presented here can serve as a benchmark
for other processes with a similarly busy final state.


A. Reconstruction quality 


In Fig. 7 we assess the quality of momentum reconstruction of the tagged buckets for
the preferred MJ10 setup, in the benchmark process $pp \to T\bar{T} \to t\bar{t}HH \to 10$ jets at the
LHC with $\sqrt{s} = 14$ TeV. The reconstructed masses of the top quarks and Higgs bosons are
shown in the upper row. Due to the ordering in the metric, the first bucket always gives 
a better reconstructed mass, leading to the dip for the second bucket. Both for top and 
Higgs candidates, there is a central peak at the true mass value for the first buckets. The 
top mass peak is much narrower, which is not surprising since Higgs buckets are filled by 
the remaining jets only after the two top buckets have been filled. The middle and lower 
rows of Fig. 7 show the deviation between the bucket momenta and the MC truth parton
momenta in terms of two variables, 


\[ \frac{\Delta p_\perp}{p_\perp^{\rm reco}} = \frac{p_\perp^{\rm reco} - p_\perp^{\rm parton}}{p_\perp^{\rm reco}} \tag{27} \]

and

\[ \Delta R_{\rm reco,\,parton} . \tag{28} \]
A narrow peak around 0 is observed for both observables, for top as well as Higgs candidates.
In fact, most of the tagged buckets are reconstructed within 20% in $\Delta p_\perp / p_\perp^{\rm reco}$ and with $\Delta R < 0.1$,
with no significant differences between the respective first and second buckets. We conclude
that (i) the tagged buckets are built from the correct jets, and that (ii) these jets reconstruct
the truth partons' momenta very well.
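
The two quality variables of Eqs. (27) and (28) are straightforward to evaluate for a matched pair of a reconstructed bucket and its truth parton. The Python sketch below is one possible implementation under stated assumptions: momenta are given as Cartesian three-vectors in GeV, the angular distance is computed from pseudorapidity and azimuth, and all function names and numbers are illustrative.

```python
import math


def pt_eta_phi(px, py, pz):
    """Transverse momentum, pseudorapidity and azimuth of a three-momentum."""
    pt = math.hypot(px, py)
    eta = math.asinh(pz / pt)
    phi = math.atan2(py, px)
    return pt, eta, phi


def reco_quality(reco, parton):
    """Return (delta_pt / pt_reco, delta_R) for a bucket and its truth parton.

    reco, parton: (px, py, pz) in GeV. The sign convention follows Eq. (27):
    a bucket that reconstructs too little transverse momentum gives a
    negative first entry.
    """
    pt_r, eta_r, phi_r = pt_eta_phi(*reco)
    pt_p, eta_p, phi_p = pt_eta_phi(*parton)
    d_phi = (phi_r - phi_p + math.pi) % (2.0 * math.pi) - math.pi
    rel_dpt = (pt_r - pt_p) / pt_r
    return rel_dpt, math.hypot(eta_r - eta_p, d_phi)


# Toy example: the bucket points along the parton direction but has
# lost part of the transverse momentum (500 GeV instead of 550 GeV).
rel_dpt, d_r = reco_quality((300.0, 400.0, 100.0), (330.0, 440.0, 110.0))
print(rel_dpt, d_r)
```

A bucket like the one in the toy example would fall inside both windows quoted above, with a relative deviation of 10% in transverse momentum and a vanishing angular distance.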

For completeness, we also show the same results for the standard clustering benchmark
setup CA03 in Fig. 8. Note that, although the distributions look similar to those of the MJ10 setup,
the total number of tagged buckets is significantly smaller. The reconstructed Higgs bosons
tend to have a broader peak, shifted to lower values, in the CA03 setup. As a result, the
mass of the reconstructed vectorlike top is also shifted to lower values, as has been observed
in Fig. 3. This is another indication that the mass-jump algorithm is better suited to this
analysis, in addition to the larger event numbers discussed in Sec. III C.


We conclude this subsection with the reconstruction quality of tagged top buckets from
the dominant $t\bar{t}$ SM background (in the MJ10 setup), which is shown in Fig. 9. Deviations
from the MC truth partons are very small, and our analysis setup is well suited for the
background processes as well. We observe that, although the transverse momentum is generally
reconstructed very accurately, the buckets tend to have lower values than in the signal case
(cf. Fig. 7). Final-state radiation off the boosted top quark may escape from the respective
top bucket, and additional hard prongs from the matrix element that could lead to splash-in
are not present.


B. A note on the global metric 

The reader might wonder whether the jets are not optimally assigned to the buckets due
to the explicitly decoupled metric in Eqs. Q and (§•. This choice was made to reduce the
combinatorial workload, but it is expected that the results do not change much even if a
more democratic ordered metric, $0 < \omega_{i+1}/\omega_i < 1$, is used. First, we observe that for any
bucket the exchange of a jet with one from $B_{\rm ISR}$ does not yield a lower metric by definition,
independent of its weights. Secondly, an interchange of jets between two buckets ($B_i$ and $B_j$
with $\omega_i > \omega_j$) may lower the measure of $B_j$, but always at the cost of raising that of $B_i$.
Because of the relative weight $\omega_i > \omega_j$, most of the interchanges are likely to increase the
global measure. To find the global minimum, one has to consider a re-assignment of several
jets simultaneously, the details of which depend on the specific weights chosen and are beyond
the scope of this paper. Note that, even if finite weights are used, the local minimum found
with explicitly decoupled buckets gives an upper bound on $\Delta_{\min}$, thus helping to reduce the
huge number of permutations.

FIG. 7: Reconstruction quality of tagged top (left) and Higgs buckets (right) for the MJ10 setup,
in the benchmark process $pp \to T\bar{T} \to t\bar{t}HH \to 10$ jets at the LHC with $\sqrt{s} = 14$ TeV (arbitrary
units). From top to bottom, the reconstructed mass, the relative deviation in transverse momentum
$\Delta p_\perp / p_\perp^{\rm reco}$, and the angular distance $\Delta R_{\rm reco,\,parton}$ are shown. The solid curves show results for the
first bucket, the dashed curves for the second bucket.
FIG. 8: The same as Fig. 7 but for the CA03 setup.

This argument is weakened if a metric is chosen that does not favour top buckets over
Higgs buckets. In our analysis, we chose to reconstruct top quarks from
three prongs first, and only after that Higgs bosons from two prongs each. This order reduces
wrong assignments for both the signal and background. Since SM processes containing Higgs
bosons in the final state are rare, the mass distributions in the Higgs buckets can serve as
sidebands to experimentally determine the background cross-sections.
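
The statement that the decoupled assignment bounds the global minimum from above can be illustrated with a small toy. The Python sketch below is not the metric of this paper: it assumes, purely for illustration, a bucket measure $|m_B - m_{\rm target}|$ and a global measure given by the weighted sum over buckets, with jets left over counted as the ISR remainder. All names, weights and jet momenta are hypothetical.

```python
import math
from itertools import combinations


def massless_jet(pt, eta, phi):
    """Four-vector (E, px, py, pz) of a massless jet."""
    return (pt * math.cosh(eta), pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))


def mass(jets):
    """Invariant mass of a collection of four-vectors."""
    e, px, py, pz = (sum(j[k] for j in jets) for k in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def decoupled(jets, targets):
    """Fill the buckets one after another (targets: list of (mass, n_jets))."""
    remaining = list(range(len(jets)))
    buckets = []
    for m_target, n in targets:
        best = min(combinations(remaining, n),
                   key=lambda c: abs(mass([jets[i] for i in c]) - m_target))
        buckets.append(best)
        remaining = [i for i in remaining if i not in best]
    return buckets


def global_measure(jets, buckets, targets, weights):
    """Weighted sum of the bucket measures for a given assignment."""
    return sum(w * abs(mass([jets[i] for i in b]) - t[0])
               for w, b, t in zip(weights, buckets, targets))


def global_minimum(jets, targets, weights, remaining=None, k=0):
    """Exhaustive minimum of the global measure over all assignments."""
    if remaining is None:
        remaining = list(range(len(jets)))
    if k == len(targets):
        return 0.0
    m_target, n = targets[k]
    best = math.inf
    for c in combinations(remaining, n):
        rest = [i for i in remaining if i not in c]
        here = weights[k] * abs(mass([jets[i] for i in c]) - m_target)
        best = min(best, here + global_minimum(jets, targets, weights, rest, k + 1))
    return best


# Toy event with six jets, one three-prong top bucket and one two-prong Higgs bucket.
jets = [massless_jet(*j) for j in [(220, 0.1, 0.0), (150, 0.4, 0.5), (90, -0.2, 0.9),
                                   (180, 1.0, 2.8), (110, 0.6, 3.3), (40, -1.5, -2.0)]]
targets = [(173.0, 3), (125.0, 2)]
weights = [1.0, 0.5]  # finite, ordered weights

buckets = decoupled(jets, targets)
upper_bound = global_measure(jets, buckets, targets, weights)
minimum = global_minimum(jets, targets, weights)
print(buckets, upper_bound, minimum)
```

Because the decoupled assignment is one of the permutations scanned by the exhaustive search, its measure can never fall below the global minimum; in a full search it could therefore be used to discard any partial assignment whose accumulated measure already exceeds it.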
FIG. 9: Reconstruction quality of tagged top buckets of the leading $t\bar{t}$ background events in the
MJ10 setup.

V. SUMMARY AND OUTLOOK 

We have presented a novel approach to the very busy all-hadronic final state emerging 
from multiple heavy resonances, focusing on vectorlike top pair production at the LHC as 
the benchmark of our studies. Since the standard techniques using large-radius fat jets 
suffer from splash-in contamination and jet overlap in such a busy environment, in this 
paper we relied entirely on separately resolved jets. It was shown that this approach,
in combination with a bucket algorithm to reduce the computational cost, gives good
results and can serve as an alternative channel in new physics searches, including a kinematic
reconstruction of the vectorlike top mass. The key ingredient is the mass-jump jet clustering 
algorithm, which is shown to greatly improve the performance compared to common jet 
algorithms. This algorithm, which established the family of jet clustering with a terminating 
veto, is able to resolve nearby hard partons into separate jets, while it resembles common 
jet algorithms if the partons are well-isolated. In addition to intrinsic jet properties, it 
introduces a dependence of the clustering history on two-jet properties, all formulated in 
terms of jet mass and mass ratios. It is this flexibility that outputs jets with variable effective 
radii, which leads to superior results compared to the fixed-radius variants. 

While a $\chi^2$-like measure could give a more accurate assignment of the jets to the various
buckets, we gave an argument that the difference to our computationally inexpensive ansatz
is not expected to be large. Another possible improvement is to require a certain number
of $b$-tagged jets for each top and Higgs candidate. We did not include this option in our
analysis because bottom tagging is difficult in such busy final states, and because it would require
a matching between tagged jets and mass-jump jets, which has not yet been investigated by
the experimental collaborations. Our results give a conservative estimate in this respect.

On top of the phenomenological study of vectorlike top pair production, we investigated 
the quality of reconstruction of the top quarks and Higgs bosons. Whereas the majority 
of tagging algorithms for boosted resonances assumes their isolation, we showed that our 
approach performs excellently in identifying the correct jet combinations even in this very 
busy and unclean environment. This study enters uncharted and often neglected territory 
when it comes to taggers, yet the results are promising and we expect that jet clustering 
algorithms with a terminating veto will find their place in future studies of high-multiplicity 
processes. The algorithm, dubbed ClusteringVetoPlugin, is publicly available in the FastJet
contributions package m